Hierarchical Boosting for Gene Function Prediction
Authors
Abstract
Functional classification of genes using diverse bio-molecular data obtained from high-throughput technologies is a fundamental problem in bioinformatics and functional genomics. Genes are organized and classified according to a hierarchical classification scheme, and each gene may participate in multiple activities. Flat classifiers, which treat each class as an independent non-hierarchical classification problem, do not take into account the hierarchical structure of the functional class taxonomies and therefore cannot exploit the information inherent in the class hierarchy. Moreover, independent classifiers, each of which predicts gene membership in one particular class, may produce an inconsistent set of predictions for a hierarchically structured classification scheme. In this paper, we propose the HML-Boosting algorithm for hierarchical multi-label classification in the context of gene function prediction. HML-Boosting exploits the hierarchical dependencies among the classes. Extensive experiments on four bio-molecular datasets, using two approaches for correcting class-membership inconsistencies during the testing phase (top-down and bottom-up), show that HML-Boosting outperforms flat classifiers across different evaluation metrics. In addition, we carry out a detailed comparison of the two inconsistency-correction approaches.
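To make the notion of class-membership inconsistency concrete, the following is a minimal sketch of top-down and bottom-up correction on a toy hierarchy. It is not the paper's implementation: the class identifiers, the `PARENT` map, and the function names are hypothetical, and the two rules shown (a class stays positive only if its parent is positive; a positive class forces its ancestors positive) are the common readings of these terms, assumed here because the abstract does not spell them out.

```python
from typing import Dict, Optional

# Hypothetical toy hierarchy: child class -> parent class (None marks a root).
PARENT: Dict[str, Optional[str]] = {
    "01": None,
    "01.01": "01",
    "01.01.03": "01.01",
    "02": None,
    "02.01": "02",
}


def depth(cls: str, parent: Dict[str, Optional[str]]) -> int:
    """Number of edges between a class and the root of its tree."""
    d = 0
    while parent[cls] is not None:
        cls = parent[cls]
        d += 1
    return d


def is_consistent(pred: Dict[str, bool], parent: Dict[str, Optional[str]]) -> bool:
    """A prediction set is consistent if every positive class has a positive parent."""
    return all(not pred[c] or parent[c] is None or pred[parent[c]] for c in pred)


def top_down(pred: Dict[str, bool], parent: Dict[str, Optional[str]]) -> Dict[str, bool]:
    """Assumed top-down rule: a class stays positive only if its parent is positive."""
    fixed: Dict[str, bool] = {}
    # Visit shallow classes first so each parent is decided before its children.
    for cls in sorted(pred, key=lambda c: depth(c, parent)):
        p = parent[cls]
        fixed[cls] = pred[cls] and (p is None or fixed[p])
    return fixed


def bottom_up(pred: Dict[str, bool], parent: Dict[str, Optional[str]]) -> Dict[str, bool]:
    """Assumed bottom-up rule: a positive class makes all of its ancestors positive."""
    fixed = dict(pred)
    # Visit deep classes first so positives propagate all the way to the root.
    for cls in sorted(pred, key=lambda c: depth(c, parent), reverse=True):
        p = parent[cls]
        if fixed[cls] and p is not None:
            fixed[p] = True
    return fixed


# Independent (flat) per-class predictions for one gene; "01.01" is positive
# although its parent "01" is negative, so the set is inconsistent.
raw = {"01": False, "01.01": True, "01.01.03": True, "02": True, "02.01": False}
td = top_down(raw, PARENT)
bu = bottom_up(raw, PARENT)
```

In this toy case the top-down rule removes the two positives under `01`, while the bottom-up rule instead switches `01` to positive. The first favors precision and the second favors recall, which is one reason a comparison of the two approaches is informative.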
Similar resources
An Extended Local Hierarchical Classifier for Prediction of Protein and Gene Functions
Gene function prediction and protein function prediction are complex classification problems where the functional classes are structured according to a predefined hierarchy. To solve these problems, we propose an extended local hierarchical Naive Bayes classifier, where a binary classifier is built for each class in the hierarchy. The extension to conventional local approaches is that each clas...
Multi-Label Hierarchical Classification for Protein Function Prediction
Hierarchical classification is a problem with applications in many areas, such as protein function prediction, where the data are hierarchically structured. Therefore, it is necessary to develop algorithms able to induce hierarchical classification models. This paper presents experiments using the algorithm for hierarchical classification called Multi-label Hierarchical Classification using...
Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach
Hierarchical prosody structure generation is an important but challenging component for speech synthesis systems. In this paper, we investigate the use of enhanced embedding (joint learning of character and word embedding (CWE)) features and different model fusion approaches at both character and word level for Mandarin prosodic boundaries prediction. For CWE module, the internal structures of ...
Evaluation of Data Mining Algorithms for Detection of Liver Disease
Background and Aim: The liver, as one of the largest internal organs in the body, is responsible for many vital functions, including purifying blood, regulating the body's hormones, and preserving glucose. Disruptions in these functions can therefore sometimes be irreparable. Early prediction of these diseases will help their early and effective treatm...
Trying Predicting Protein Function Using Machine-learned Hierarchical Classifiers
High performance and accurate protein function prediction is a challenging problem in Bioinformatics. Many contemporary ontologies, such as Gene Ontology, have a hierarchical structure that can be exploited to improve the prediction accuracy, and lower the computational cost of protein function prediction. The structure of the hierarchy is leveraged in two ways: First, a novel method of creatin...
Publication date: 2010